
    Emulating and evaluating hybrid memory for managed languages on NUMA hardware

    Non-volatile memory (NVM) has the potential to become a mainstream memory technology and challenge DRAM. Researchers evaluating the speed, endurance, and abstractions of hybrid memories with DRAM and NVM typically use simulation, making it easy to evaluate the impact of different hardware technologies and parameters. Simulation is, however, extremely slow, limiting the applications and datasets in the evaluation. Simulation also precludes critical workloads, especially those written in managed languages such as Java and C#. Good methodology embraces a variety of techniques for evaluating new ideas, expanding the experimental scope, and uncovering new insights. This paper introduces a platform to emulate hybrid memory for managed languages using commodity NUMA servers. Emulation complements simulation but offers richer software experimentation. We use a thread-local socket to emulate DRAM and a remote socket to emulate NVM. We use standard C library routines to allocate heap memory on the DRAM and NVM sockets for use with explicit memory management or garbage collection. We evaluate the emulator using various configurations of write-rationing garbage collectors that improve NVM lifetimes by limiting writes to NVM, using 15 applications and various datasets and workload configurations. We show emulation and simulation confirm each other's trends in terms of writes to NVM for different software configurations, increasing our confidence in predicting future system effects. Emulation brings novel insights, such as the non-linear effects of multi-programmed workloads on NVM writes, and that Java applications write significantly more than their C++ equivalents. We make our software infrastructure publicly available to advance the evaluation of novel memory management schemes on hybrid memories
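
    A minimal sketch of the kind of NUMA placement the emulator relies on is shown below, using the libnuma C API to put one heap region on a local node (standing in for DRAM) and another on a remote node (standing in for NVM); the node numbers and region sizes are illustrative assumptions, not the paper's configuration.

```c
/* Sketch: place heap regions on specific NUMA nodes with libnuma.
   Build with: cc numa_sketch.c -lnuma */
#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    /* Illustrative assumption: node 0 = local socket ("DRAM"),
       node 1 = remote socket ("NVM"). */
    const int dram_node = 0, nvm_node = 1;
    size_t dram_bytes = 64UL << 20;    /* 64 MiB "DRAM" region */
    size_t nvm_bytes  = 256UL << 20;   /* 256 MiB "NVM" region */

    void *dram_heap = numa_alloc_onnode(dram_bytes, dram_node);
    void *nvm_heap  = numa_alloc_onnode(nvm_bytes, nvm_node);
    if (!dram_heap || !nvm_heap) {
        fprintf(stderr, "NUMA allocation failed\n");
        return 1;
    }

    /* A runtime would now carve its heap spaces out of these regions,
       e.g. a nursery in dram_heap and a mature space in nvm_heap. */

    numa_free(dram_heap, dram_bytes);
    numa_free(nvm_heap, nvm_bytes);
    return 0;
}
```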

    MRI white matter changes: association with gait and executive function, age, and Parkinson's disease

    White matter changes are a frequent finding on cerebral MRI, especially with increasing age, and arise from cerebral small-vessel disease (microangiopathy). They are associated with the development of disability, with impairments of executive function, and with gait and balance disorders. In the literature, these associations have been demonstrated both for older subjects and for patients with Parkinson's disease (PE). In our study on gait and balance disorders, the frequency, causes, and risk factors of gait and balance disorders were investigated in patients of an acute neurology department. The present analysis aimed to show to what extent the factors "older age" (here taken as a form of unspecific brain pathology) and "presence of PE" (here taken as a form of specific brain pathology) are associated with particular deficits of gait and executive function that are influenced by white matter changes. To this end, 138 subjects were examined (65 younger persons without PE (50-69 years, jPn), 22 younger patients with PE (50-69 years, jPE), 36 older persons without PE (70-89 years, äPn), and 15 older patients with PE (70-89 years, äPE)). The presence and severity of white matter changes were rated on cerebral MRI using the Fazekas scale. Gait speed under single-task and dual-task conditions (with serial subtraction in steps of seven) and the Trail Making Test were used as parameters of gait and executive function. Correlations between white matter changes and the gait and executive-function parameters were computed as Spearman rank correlations and tested for group differences using a Z-test (Fisher transformation). The four groups did not differ relevantly in their demographic and clinical characteristics or in the severity of white matter changes. jPn and jPE showed significant correlations of white matter changes with rather simple gait and executive-function parameters, äPn with parameters of higher complexity, and äPE with the most complex parameters of this study. The study thus found a clear and selective association between white-matter-related deficits of gait and executive function and "specific" (here, PE-related) and unspecific (age-related) brain pathologies. This relationship cannot be explained by the concept of cognitive reserve alone, and the "posture second" theory also accounts for only part of the observed effects. The groups' differing performance levels across the various tasks appear to play a partial role. None of the explanatory approaches discussed fully accounts for the pattern of selective associations found here. Nevertheless, our results should be taken into account in future studies investigating the influence of white matter changes on gait and executive function in older subjects and in patients with neurodegenerative diseases.
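
    For reference, the group comparison of correlations mentioned above (Z-test via Fisher transformation) is usually formulated as follows; this is the standard textbook form, not a formula taken from the dissertation itself.

```latex
% Fisher z-transformation of a correlation coefficient r
z = \operatorname{artanh}(r) = \tfrac{1}{2}\,\ln\frac{1+r}{1-r}

% Test statistic for the difference between two independent correlations
% r_1 and r_2 estimated from groups of size n_1 and n_2
Z = \frac{z_1 - z_2}{\sqrt{\dfrac{1}{n_1-3} + \dfrac{1}{n_2-3}}}
```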

    Crystal Gazer: profile-driven write-rationing garbage collection for hybrid memories

    Non-volatile memories (NVM) offer greater capacity than DRAM but suffer from high latency and low write endurance. Hybrid memories combine DRAM and NVM to form scalable memory systems with the promise of high capacity, low energy consumption, and high endurance. Automatically managing hybrid NVM-DRAM memories to achieve their promise without changing user applications or their programming models remains an open question. This paper uses garbage collection in managed languages to exploit NVM capacity while preventing NVM wear out in hybrid memories with no changes to the programming model. We introduce profile-driven write-rationing garbage collection. Allocation sites that produce frequently written objects are predicted based on previous program executions. Objects are initially allocated in a DRAM nursery space. The collector copies surviving nursery objects from highly written sites to a mature DRAM space and read-mostly objects to a mature NVM space. Write-intensity prediction for 15 Java benchmarks accurately places objects in the correct space, eliminating expensive object monitoring from prior write-rationing garbage collectors. Furthermore, our technique exposes a Pareto tradeoff between DRAM usage and NVM lifetime, unlike prior work. Experimental results on NUMA hardware that emulates hybrid NVM-DRAM memory demonstrate that profile-driven write-rationing garbage collection reduces the number of writes to NVM compared to prior work to extend its lifetime, maximizes the use of NVM for its capacity, and achieves good performance.
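
    A rough sketch of the placement decision described above, copying a surviving nursery object to the mature DRAM space when its allocation site was profiled as write-intensive and to the mature NVM space otherwise, might look like the following; the names (site_is_write_intensive, copy_into, the space handles) are illustrative assumptions rather than the paper's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sketch of profile-driven placement during a nursery
   collection; the names below are assumptions, not the paper's code. */
typedef struct object { int alloc_site; size_t size; /* payload follows */ } object_t;
typedef struct space space_t;                    /* a mature heap space */

extern space_t mature_dram;                      /* mature space in DRAM */
extern space_t mature_nvm;                       /* mature space in NVM  */
extern bool site_is_write_intensive(int site);   /* from offline profiling runs */
extern object_t *copy_into(space_t *to, object_t *obj);   /* copying routine */

/* Called for each nursery object that survives a minor collection. */
object_t *promote(object_t *obj) {
    /* Predicted hot-written objects stay in DRAM to protect NVM endurance;
       read-mostly objects go to NVM to exploit its capacity. */
    if (site_is_write_intensive(obj->alloc_site))
        return copy_into(&mature_dram, obj);
    return copy_into(&mature_nvm, obj);
}
```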

    Cooperative cache scrubbing

    Managing the limited resources of power and memory bandwidth while improving performance on multicore hardware is challenging. In particular, more cores demand more memory bandwidth, and multi-threaded applications increasingly stress memory systems, leading to more energy consumption. However, we demonstrate that not all memory traffic is necessary. For modern Java programs, 10 to 60% of DRAM writes are useless, because the data on these lines are dead: the program is guaranteed to never read them again. Furthermore, reading memory only to immediately zero-initialize it wastes bandwidth. We propose a software/hardware cooperative solution: the memory manager communicates dead and zero lines with cache scrubbing instructions. We show how scrubbing instructions satisfy MESI cache coherence protocol invariants and demonstrate them in a Java Virtual Machine and multicore simulator. Scrubbing reduces average DRAM traffic by 59%, total DRAM energy by 14%, and dynamic DRAM energy by 57% on a range of configurations. Cooperative software/hardware cache scrubbing reduces memory bandwidth and improves energy efficiency, two critical problems in modern systems.
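
    To make the software/hardware cooperation concrete, the sketch below shows where a memory manager might issue scrub hints once a collection has identified dead regions, and zero hints before reusing memory; cache_scrub_line, cache_zero_line, and the 64-byte line size are hypothetical stand-ins for the paper's scrubbing instructions, not a real API.

```c
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache-line size */

/* Hypothetical wrappers for the proposed scrubbing instructions: mark a
   line dead so it is never written back, or zero it in cache without
   fetching it from DRAM first. */
extern void cache_scrub_line(void *line_addr);
extern void cache_zero_line(void *line_addr);

/* After garbage collection, the memory manager knows a region is dead:
   scrub its lines so the cache never writes them back to DRAM. */
void scrub_dead_region(void *start, size_t bytes) {
    uintptr_t p   = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)start + bytes;
    for (; p < end; p += CACHE_LINE)
        cache_scrub_line((void *)p);
}

/* Before handing a region to the allocator, zero it in cache instead of
   reading soon-to-be-overwritten data from memory. */
void zero_fresh_region(void *start, size_t bytes) {
    uintptr_t p   = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)start + bytes;
    for (; p < end; p += CACHE_LINE)
        cache_zero_line((void *)p);
}
```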

    Automatic design of domain-specific instructions for low-power processors

    This paper explores hardware specialization of low-power processors to improve performance and energy efficiency. Our main contribution is an automated framework that analyzes instruction sequences of applications within a domain at the loop body level and identifies exactly and partially-matching sequences across applications that can become custom instructions. Our framework transforms sequences to a new code abstraction, a Merging Diagram, that improves similarity identification, clusters alike groups of potential custom instructions to effectively reduce the search space, and selects merged custom instructions to efficiently exploit the available customizable area. For a set of 11 media applications, our fast framework generates instructions that significantly improve the energy-delay product and speedup, achieving more than double the savings as compared to a technique analyzing sequences within basic blocks. This paper shows that partially-matched custom instructions, which do not significantly increase design time, are crucial to achieving higher energy efficiency at limited hardware areas.
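
    The last step described above, selecting merged custom instructions to exploit a limited customizable area, is essentially a budgeted selection problem; the greedy sketch below is an illustrative simplification (the benefit-per-area scoring and all names are assumptions, not the paper's selection algorithm).

```c
#include <stdlib.h>

/* Illustrative candidate custom instruction with estimated benefit and
   hardware area; names and scoring are assumptions, not the paper's. */
typedef struct {
    double benefit;   /* estimated energy-delay improvement if selected */
    double area;      /* estimated customizable area it consumes */
    int    selected;
} candidate_t;

static int by_density_desc(const void *a, const void *b) {
    const candidate_t *x = a, *y = b;
    double dx = x->benefit / x->area, dy = y->benefit / y->area;
    return (dx < dy) - (dx > dy);   /* sort by benefit-per-area, descending */
}

/* Greedily pick candidates with the best benefit-per-area until the
   available customizable area is exhausted. */
void select_custom_instructions(candidate_t *c, size_t n, double area_budget) {
    qsort(c, n, sizeof *c, by_density_desc);
    for (size_t i = 0; i < n; i++) {
        if (c[i].area <= area_budget) {
            c[i].selected = 1;
            area_budget -= c[i].area;
        }
    }
}
```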

    Intensive rehabilitation programme for patients with subacute stroke in an inpatient rehabilitation facility: describing a protocol of a prospective cohort study

    Keywords: Rehabilitation; Stroke; Prospective studies. Rehabilitation is recognised as a cornerstone of multidisciplinary stroke care. Intensity of therapy is related to functional recovery, although there is high variability in the amount of time and techniques applied in therapy sessions. There is a need to better describe stroke rehabilitation protocols to develop a better understanding of current practice, increasing the internal validity and generalisation of clinical trial results. The aim of this study is to describe an intensive rehabilitation programme for patients with stroke in an inpatient rehabilitation facility, measuring the amount and type of therapies (physical, occupational and speech therapy) provided and reporting functional outcomes. Methods and analysis: This will be a prospective observational cohort study of patients with subacute stroke admitted to our inpatient rehabilitation facility over a 2-year period. A therapy recording tool was developed in order to describe the rehabilitation interventions performed in our unit. This tool was designed using the Delphi method, a literature search, and collaboration with senior clinicians. Therapists will record the time spent on different activities available in our unit during specific therapy sessions. Afterwards, the total time spent in each activity, and the total rehabilitation time for all activities, will be averaged across all patients. Outcome variables are divided into three domains (body structure and function outcomes, activity outcomes, and participation outcomes) and will be assessed at baseline (admission to the rehabilitation unit), at discharge from the rehabilitation unit, and at 3 and 6 months after stroke. Ethics and dissemination: This study was approved by the Medical Research Committee at Hospital del Mar Research Institute (Project ID: 34/C/2017). The results of this study will be presented at national and international congresses and submitted for publication in peer-reviewed journals. This project is funded by Fundació La Marató TV.

    Just-in-Time Data Structures

    Today, software engineering practices focus on finding the single "right" data representation (i.e., data structure) for a program. The right data representation, however, might not exist: relying on a single representation of the data for the lifetime of the program can be suboptimal in terms of performance. We explore the idea of developing data structures for which changing the data representation is an intrinsic property. To this end we introduce Just-in-Time Data Structures, which enable representation changes at runtime, based on declarative input from a performance expert programmer. Just-in-Time Data Structures are an attempt to shift the focus from finding the "right" data structure to finding the right sequence of data representations. We present JitDS-Java, an extension to the Java language, to develop Just-in-Time Data Structures. Further, we show two example programs that benefit from changing the representation at runtime
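
    The C sketch below illustrates the underlying idea of a structure whose representation can change at run time, switching from a contiguous array (fast indexed reads) to a linked list (cheap insertions); it is a conceptual sketch only and does not reflect the JitDS-Java language extension or its declarative representation-change rules.

```c
#include <stdlib.h>

/* A sequence with two interchangeable representations; a conceptual
   sketch of representation switching, not the JitDS-Java design.
   Error handling is omitted for brevity. */
typedef struct node { int value; struct node *next; } node_t;
typedef enum { REP_ARRAY, REP_LIST } rep_t;

typedef struct {
    rep_t   rep;
    int    *array;      /* REP_ARRAY: contiguous storage */
    size_t  len, cap;
    node_t *head;       /* REP_LIST: singly linked nodes */
} jit_seq_t;

/* Switch from the array representation to the list representation,
   e.g. when a profiling hint indicates insertions now dominate. */
void to_list_representation(jit_seq_t *s) {
    if (s->rep == REP_LIST) return;
    node_t *head = NULL;
    for (size_t i = s->len; i-- > 0; ) {   /* build the list back to front */
        node_t *n = malloc(sizeof *n);
        n->value = s->array[i];
        n->next  = head;
        head     = n;
    }
    free(s->array);
    s->array = NULL;
    s->cap   = 0;
    s->rep   = REP_LIST;
    s->head  = head;
}
```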

    A new method to remove hybridization bias for interspecies comparison of global gene expression profiles uncovers an association between mRNA sequence divergence and differential gene expression in Xenopus

    The recent sequencing of a large number of Xenopus tropicalis expressed sequences has allowed development of a high-throughput approach to study Xenopus global RNA gene expression. We examined the global gene expression similarities and differences between the historically significant Xenopus laevis model system and the increasingly used X. tropicalis model system and assessed whether an X. tropicalis microarray platform can be used for X. laevis. These closely related species were also used to investigate a more general question: is there an association between mRNA sequence divergence and differences in gene expression levels? We carried out a comprehensive comparison of global gene expression profiles using microarrays of different tissues and developmental stages of X. laevis and X. tropicalis. We (i) show that the X. tropicalis probes provide an efficacious microarray platform for X. laevis, (ii) describe methods to compare interspecies mRNA profiles that correct for differences in hybridization efficiency, and (iii) show, independently of hybridization bias, that as mRNA sequence divergence between X. laevis and X. tropicalis increases, differences in mRNA expression levels also increase.

    BugSifter: A Generalized Accelerator for Flexible Instruction-Grain Monitoring

    Software robustness is an ever-challenging problem in the face of today's rapidly evolving software and hardware. Instruction-grain monitoring is a powerful approach for improved software robustness that affords comprehensive runtime coverage for a wide spectrum of bugs and security exploits. Unfortunately, existing instruction-grain monitoring frameworks, such as dynamic binary instrumentation, are either prohibitively expensive (slowing down applications by an order of magnitude or more) or offer limited coverage. This work introduces BugSifter, a new design that drastically decreases monitoring overhead without sacrificing flexibility or bug coverage. The main overhead of instruction-grain monitoring lies in executing software event handlers that monitor nearly every application instruction to check for bugs. BugSifter identifies common monitoring activities that result in redundant monitoring actions and filters them using general, lightweight hardware, eliminating the majority of costly software event handlers. Our proposed design filters 80-98% of events while monitoring for a variety of commonly occurring bugs, delegating the rest to flexible software handlers. BugSifter significantly reduces the overhead of instruction-grain monitoring to an average of 40% over unmonitored application time. BugSifter makes instruction-grain monitoring practical, enabling efficient and timely detection of a wide range of bugs, thus making software more robust.
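
    The core idea, filtering redundant instruction-grain events in lightweight hardware and delegating only the remaining events to software handlers, can be pictured roughly as in the sketch below; the direct-mapped filter table, its indexing, and all names are illustrative assumptions and not BugSifter's actual design.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative monitoring event: the monitored instruction's PC and the
   data address it touched. */
typedef struct { uintptr_t pc; uintptr_t addr; } event_t;

#define FILTER_ENTRIES 1024

/* A small direct-mapped table remembering addresses whose last check
   yielded no new information; a hit means the event is redundant. */
static uintptr_t filter_table[FILTER_ENTRIES];

extern void software_handler(const event_t *e);   /* flexible, tool-specific check */

void monitor_event(const event_t *e) {
    size_t idx = (e->addr >> 3) % FILTER_ENTRIES;
    if (filter_table[idx] == e->addr)
        return;                      /* filtered: skip the costly software handler */
    software_handler(e);             /* delegate interesting events to software */
    filter_table[idx] = e->addr;     /* remember the address so repeats are filtered */
}
```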